[FIX] test_hymba#2872

Merged
Qubitium merged 5 commits into main from zx_fix_hymba
May 11, 2026

Conversation

@ZX-ModelCloud
Collaborator

Summary

Fix test_hymba

What Changed

1. `shared_kv_cache_dict` was only populated when `reuse_kv=True`.

That breaks models like Hymba, where not every decoder layer has `reuse_kv=True`: earlier layers may need to publish their KV for later layers even when the current layer itself does not consume `kv_last_layer`. In those cases, `prev_kv` is still empty by the time a later layer actually needs it.

This change adds a model-level `write_shared_kv_cache` switch on `BaseQModel`, keeps the default behavior unchanged, and enables the switch for Hymba.

2. Hymba is now compatible with Transformers v5.
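The cache-publishing change in point 1 can be illustrated with a minimal sketch. The names `reuse_kv`, `prev_kv`, `shared_kv_cache_dict`, and `write_shared_kv_cache` come from the PR description; the layer structure and return values below are hypothetical and simplified, not the actual gptqmodel implementation.

```python
# Hypothetical sketch of the KV-cache publishing logic described above.
# Identifier names follow the PR text; everything else is assumed.

class Layer:
    """Stand-in for a decoder layer with a per-layer reuse_kv flag."""
    def __init__(self, idx: int, reuse_kv: bool):
        self.idx = idx
        self.reuse_kv = reuse_kv

def run_layers(layers, write_shared_kv_cache: bool = False):
    """Walk the layers, returning (layer_idx, prev_kv) for each consumer."""
    shared_kv_cache_dict = {}
    consumed = []
    for layer in layers:
        # A consumer looks up the KV its producer should have published.
        prev_kv = shared_kv_cache_dict.get(layer.idx - 1)
        if layer.reuse_kv:
            consumed.append((layer.idx, prev_kv))
        # Old behavior: publish KV only when this layer has reuse_kv=True.
        # New behavior: the model-level switch makes every layer publish,
        # so a later reuse_kv layer can find its producer's KV.
        if layer.reuse_kv or write_shared_kv_cache:
            shared_kv_cache_dict[layer.idx] = f"kv_from_layer_{layer.idx}"
    return consumed

layers = [Layer(0, reuse_kv=False), Layer(1, reuse_kv=True)]

# Without the switch, layer 0 never publishes, so layer 1 sees nothing:
print(run_layers(layers))                              # [(1, None)]
# With the switch, layer 0 publishes and layer 1 consumes its KV:
print(run_layers(layers, write_shared_kv_cache=True))  # [(1, 'kv_from_layer_0')]
```

This mirrors the Hymba failure mode: the producing layer (layer 0 here) has `reuse_kv=False`, so under the old condition it never wrote into `shared_kv_cache_dict`, and the consuming layer found `prev_kv` empty.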

Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
Review comment thread on gptqmodel/utils/hf.py: marked Fixed.
@Qubitium Qubitium merged commit 8cf7ed7 into main May 11, 2026
6 checks passed
@Qubitium Qubitium deleted the zx_fix_hymba branch May 11, 2026 01:22
